ARITHMETIC UNIT AND METHOD FOR DATA STORAGE AND READING

BACKGROUND OF THE INVENTION 

The present invention relates to an arithmetic unit for
executing an arithmetic process on data in which a word is not
the standard 2^n bits wide.

Some devices that output or process images and audio use data
in which a word is not the standard 2^n bits wide. This is done
to improve image and audio quality, or to attach additional
information to the image and audio. Examples of such devices are
so-called third generation mobile phones and information
processors that generate image data with various tones. Devices
of this type include a digital signal processor (hereinafter
referred to as DSP) or another type of arithmetic unit. In such
devices, the bit width of a word is converted into a standard
2^n bits for the various arithmetic processes.

Below, a conventional arithmetic unit is described, taking a DSP
mounted in a third generation mobile phone as an example.

The third generation mobile phone uses a DSP to extract signals
in several specific bands out of a wide frequency band, so that
communications are always in good condition. From the extraction
result, the one specific frequency band with especially high
reception sensitivity is selected for communications. For this
signal extraction, the DSP uses a technique called the digital
matched filter (hereinafter referred to as DMF) algorithm, which
will be described later, to emphasize the path intensity of
signals in specific bands, and thereby extracts the signals in
those bands.

The conventional DSP outputs 16 bits of data at a time from
memory to an arithmetic logic unit (hereinafter referred to as
ALU). The issue is that the arithmetic process does not require
all 16 bits, only 10 of them. This means that the conventional
arithmetic unit wastefully outputs 6 bits of data to the ALU
each time.

The ALU includes a 32-bit-wide arithmetic section (not shown),
but uses only a part of it, i.e., a 10-bit-wide section. This
means that 22 bits of the arithmetic section go to waste in the
conventional arithmetic unit.

As such, when executing an arithmetic process on data in which
a word is not the standard 2^n bits wide, the conventional
arithmetic unit wastes part of the arithmetic section in the ALU
and part of the memory. This is because an unused part is
arranged between the I part data and the R part data. It is a
particular problem that, when an operation process using the DMF
algorithm is executed, arithmetic capability and memory cannot
be used efficiently.

SUMMARY OF THE INVENTION

An arithmetic unit of the present invention includes a memory,
an arithmetic logic unit, a register and a combining circuit.
The arithmetic logic unit executes a predetermined arithmetic
operation with respect to the data read from the memory. The
register temporarily stores the data read from the memory. The
combining circuit selects one of the arithmetic logic unit and
the register. The combining circuit replaces a part of the data
read from the memory with output data received from the selected
one of the arithmetic logic unit and the register.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a diagram showing the structure of an arithmetic 
unit of the present invention; 

Fig. 2 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 3 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 4 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 5 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 6 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 7 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 8 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 9 is a diagram showing arrangement of data to be stored
in two memory blocks;

Fig. 10 is a diagram showing arrangement of data to be 
stored in two memory blocks; 

Fig. 11 is a diagram showing arrangement of data to be 
stored in two memory blocks; 

Fig. 12 is a diagram showing arrangement of data to be 
stored in two memory blocks; 

Fig. 13 is a diagram showing change of data to be stored 
in two memory blocks; 

Fig. 14 is a diagram showing the internal structure of 
an ALU of the present embodiment; 

Fig. 15 is a diagram showing change of output from a 
combining circuit; 

Fig. 16 is a diagram showing arrangement of cyclic data; 

Fig. 17 is a diagram roughly showing a DMF algorithm; and 

Fig. 18 is a diagram roughly showing the DMF algorithm. 

DETAILED DESCRIPTION OF THE INVENTION 

An embodiment of the present invention aims to reduce the amount
of memory used to two-thirds by using a 16-bit register and a
combining circuit, and by having a control section perform
special control. Here, the amount of memory used is specifically
the amount used when an arithmetic process is executed with the
DMF algorithm.

Fig. 1 is a diagram showing the structure of an arithmetic 
unit of the present invention. 

The arithmetic unit of the invention is structured so that an
arithmetic operation on data in which a word is not the standard
2^n bits wide can be performed at high speed with less memory.

The embodiment of the invention is now described below by
taking a DSP for a third generation mobile phone as an example.
The accompanying drawings are all schematic and are intended
only to provide an overall understanding of the present
invention. In each drawing, any common component is given the
same reference numeral and is not described twice.

A DSP of the present embodiment includes, as shown in Fig. 1,
a register 27 and a combining circuit 29. The register 27 is
provided for temporarily storing data coming from memory 17. The
combining circuit 29 is provided for partially replacing the
data coming from the memory 17, i.e., the data temporarily
stored in the register 27, with data coming from an ALU 13'.
Like the other components, the register 27 and the combining
circuit 29 are both under the control of a control section,
which is not shown. The ALU 13' of the embodiment is provided
with a function of dividing a carry signal at an arbitrary
position in response to a division signal K. This will be
described later.

In the embodiment, the memory 17 is designed so that it can be
replaced with general-purpose memory that is already widely
available on the market. The data to be stored in the memory 17
is arranged as shown in Figs. 2 to 12. The control section (not
shown) performs special control of the memory 17 to make the
arithmetic operation easier for the ALU 13'.

Figs. 2 to 12 are diagrams showing arrangement of data to be
stored in two memory blocks. Specifically, Fig. 2 shows the
approximate arrangement of data to be stored in the memory 17,
and Figs. 3 to 12 each show the detailed arrangement of data to
be stored in the memory 17. In Figs. 3 to 12, "[-In-]" denotes
a region in which I part data is stored, "[-Rn-]" denotes a
region in which R part data is stored, and the remaining marking
denotes a region in which unused data is stored.

As shown in Figs. 2 to 12, the memory 17 stores data
successively into first and second 16-bit memory blocks 19 and
21. What is stored is 24-bit data, which is a combination of
10-bit I part data, 10-bit R part data, and 4-bit unused data.
Storing data in this way reduces the unused region of the memory
17 from 12/32=37.5% to 4/32=12.5%.
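
The storage scheme described above can be illustrated with a
minimal Python sketch. It assumes that each 24-bit unit holds
the I part in its low 10 bits, the R part in the next 10 bits,
and zeros in the 4 unused bits, and that units are laid end to
end before being cut into 16-bit words. The function names and
the bit order inside a unit are assumptions made for
illustration; the figures of the specification define the real
layout.

```python
def pack_sample(i_part: int, r_part: int) -> int:
    """Pack 10-bit I and 10-bit R into one 24-bit unit (top 4 bits unused)."""
    assert 0 <= i_part < (1 << 10) and 0 <= r_part < (1 << 10)
    return i_part | (r_part << 10)


def store_samples(samples):
    """Lay 24-bit units end to end and cut the bit stream into 16-bit words."""
    stream = 0
    for k, (i_part, r_part) in enumerate(samples):
        stream |= pack_sample(i_part, r_part) << (24 * k)
    n_words = (24 * len(samples) + 15) // 16  # round up to whole words
    return [(stream >> (16 * a)) & 0xFFFF for a in range(n_words)]


# Two samples occupy three 16-bit words instead of four.
words = store_samples([(0x3FF, 0x001), (0x155, 0x2AA)])
print([hex(w) for w in words])
```

Under these assumptions the middle word is shared: its low byte
belongs to the first sample and its high byte to the second.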

Note that the bit width of the unused data is preferably chosen
so that the data subsequent to it starts from the 0th or 8th
bit, for reasons described later. In the example of Fig. 2, the
unused data is 4 bits wide because both the I part data and the
R part data are 10 bits wide. If the bit width of the I part
data or the R part data is changed, the bit width of the unused
data is preferably changed correspondingly so that the
subsequent data starts from the 0th or 8th bit.
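
The rule for choosing the width of the unused data amounts to
padding each I/R pair up to the next multiple of 8 bits. The
helper below is a hypothetical illustration of that arithmetic,
not part of the described hardware.

```python
def unused_width(i_bits: int, r_bits: int) -> int:
    """Padding bits needed so the next unit starts on a byte boundary."""
    return -(i_bits + r_bits) % 8


print(unused_width(10, 10))  # the 10-bit I / 10-bit R case of Fig. 2
```

With 10-bit I and R parts this gives the 4 unused bits of the
embodiment; widths whose sum is already a multiple of 8 would
need no padding under this rule.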

In the present embodiment, the control section (not shown)
performs the following control of the memory 17.

As an example, the memory 17 stores data as shown in Figs. 3
to 12.

The control section provides an input A0 to the ALU 13' based
on the DMF algorithm shown in Figs. 17 and 18. Here, the input
A0 is presumed to be I0 part data and R0 part data. The I0 part
data and R0 part data are stored at addresses 0 and 1 of the
memory 17. The control section causes the data stored at
addresses 0 and 1 of the memory 17 to go to the ALU 13' via a
shifter 11, and also to the register 27 for temporary storage.
Then, the control section causes the ALU 13' to execute an
arithmetic process, and the arithmetic result is temporarily
stored in an accumulator (hereinafter referred to as Acc) 15.
Thereafter, the control section causes the register 27 to output
the data temporarily stored therein to the combining circuit 29,
and the Acc 15 to output the arithmetic result temporarily
stored therein to the combining circuit 29. The data thus output
to the combining circuit 29 is combined under the control to be
described later, and the combination result is then output to
the memory 17. The memory 17 stores the received combination
result at original addresses 0 and 1.

The control section supplies an output A1 to the ALU 13' based
on the DMF algorithm. Here, the output A1 is of a delay value
D1. In this example, because the delay value D1 denotes 128
cycles, the output A1 is data stored with 0+128=128 cycles
delayed. That is, the data is I128 part data and R128 part data,
which are stored at addresses 192 and 193 of the memory 17,
respectively. The control section provides the data stored at
addresses 192 and 193 to the ALU 13' via the shifter 11. The
data is also provided to the register 27 for temporary storage
therein. Next, the control section causes the ALU 13' to execute
an arithmetic process, and the arithmetic result is temporarily
stored in the Acc 15. The data temporarily stored in the
register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 192 and 193.

The control section also supplies an output A2 to the ALU 13'
based on the DMF algorithm. Here, the output A2 is of a delay
value D2. In this example, because the delay value D2 denotes
64 cycles, the output A2 is data stored with 128+64=192 cycles
delayed. That is, the data is I192 part data and R192 part data,
which are stored at addresses 288 and 289 of the memory 17,
respectively. The control section provides the data stored at
addresses 288 and 289 of the memory 17 to the ALU 13' via the
shifter 11. The data is also provided to the register 27 for
temporary storage. Next, the control section causes the ALU 13'
to execute an arithmetic process, and the arithmetic result is
temporarily stored in the Acc 15. The data temporarily stored in
the register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 288 and 289.

The control section also supplies an output A3 to the ALU 13'
based on the DMF algorithm. Here, the output A3 is of a delay
value D3. In this example, because the delay value D3 denotes
16 cycles, the output A3 is data stored with 192+16=208 cycles
delayed. That is, the data is I208 part data and R208 part data,
which are stored at addresses 312 and 313 of the memory 17,
respectively. The control section provides the data stored at
addresses 312 and 313 of the memory 17 to the ALU 13' via the
shifter 11. The data is also provided to the register 27 for
temporary storage. Next, the control section causes the ALU 13'
to execute an arithmetic process, and the arithmetic result is
temporarily stored in the Acc 15. The data temporarily stored in
the register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 312 and 313.

The control section also supplies an output A4 to the ALU 13'
based on the DMF algorithm. Here, the output A4 is of a delay
value D4. In this example, because the delay value D4 denotes
32 cycles, the output A4 is data stored with 208+32=240 cycles
delayed. That is, the data is I240 part data and R240 part data,
which are stored at addresses 360 and 361 of the memory 17,
respectively. The control section provides the data stored at
addresses 360 and 361 of the memory 17 to the ALU 13' via the
shifter 11. The data is also provided to the register 27 for
temporary storage. Next, the control section causes the ALU 13'
to execute an arithmetic process, and the arithmetic result is
temporarily stored in the Acc 15. The data temporarily stored in
the register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 360 and 361.

The control section also supplies an output A5 to the ALU 13'
based on the DMF algorithm. Here, the output A5 is of a delay
value D5. In this example, because the delay value D5 denotes
8 cycles, the output A5 is data stored with 240+8=248 cycles
delayed. That is, the data is I248 part data and R248 part data,
which are stored at addresses 372 and 373 of the memory 17,
respectively. The control section provides the data stored at
addresses 372 and 373 of the memory 17 to the ALU 13' via the
shifter 11. The data is also provided to the register 27 for
temporary storage. Next, the control section causes the ALU 13'
to execute an arithmetic process, and the arithmetic result is
temporarily stored in the Acc 15. The data temporarily stored in
the register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 372 and 373.

The control section also supplies an output A6 to the ALU 13'
based on the DMF algorithm. Here, the output A6 is of a delay
value D6. In this example, because the delay value D6 denotes
1 cycle, the output A6 is data stored with 248+1=249 cycles
delayed. That is, the data is I249 part data and R249 part data,
which are stored at addresses 373 and 374 of the memory 17,
respectively. The control section provides the data stored at
addresses 373 and 374 in the memory 17 to the ALU 13' via the
shifter 11. The data is also provided to the register 27 for
temporary storage. Next, the control section causes the ALU 13'
to execute an arithmetic process, and the arithmetic result is
temporarily stored in the Acc 15. The data temporarily stored in
the register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 373 and 374.

The control section also supplies an output A7 to the ALU 13'
based on the DMF algorithm. Here, the output A7 is of a delay
value D7. In this example, because the delay value D7 denotes
4 cycles, the output A7 is data stored with 249+4=253 cycles
delayed. That is, the data is I253 part data and R253 part data,
which are stored at addresses 379 and 380 of the memory 17,
respectively. The control section provides the data stored at
addresses 379 and 380 in the memory 17 to the ALU 13' via the
shifter 11. The data is also provided to the register 27 for
temporary storage. Next, the control section causes the ALU 13'
to execute an arithmetic process, and the arithmetic result is
temporarily stored in the Acc 15. The data temporarily stored in
the register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 379 and 380.

The control section also supplies an output A8 to the ALU 13'
based on the DMF algorithm. Here, the output A8 is of a delay
value D8. In this example, because the delay value D8 denotes
2 cycles, the output A8 is data stored with 253+2=255 cycles
delayed. That is, the data is I255 part data and R255 part data,
which are stored at addresses 382 and 383 of the memory 17,
respectively. The control section provides the data stored at
addresses 382 and 383 in the memory 17 to the ALU 13' via the
shifter 11. The data is also provided to the register 27 for
temporary storage. Next, the control section causes the ALU 13'
to execute an arithmetic process, and the arithmetic result is
temporarily stored in the Acc 15. The data temporarily stored in
the register 27 is output to the combining circuit 29, and the
arithmetic result temporarily stored in the Acc 15 is also
output to the combining circuit 29. The data and result output
to the combining circuit 29 are combined therein under the
control which will be described later, and the combination
result is output to the memory 17. Then, the memory 17 is
controlled so as to store the combination result at original
addresses 382 and 383.

Thereafter, the I255 part data and the R255 part data at
address 383 are provided with the input A0, which is of the next
delay value D1. During the next arithmetic process, the control
section regards address 383 as new address 0, and operates
similarly to the above.
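
The addresses quoted in the walkthrough above follow from two
pieces of arithmetic: the delay values D1 to D8 accumulate into
sample indices, and each 24-bit sample starts at 16-bit word
address floor(24k/16). The following Python sketch reproduces
that bookkeeping. The names `DELAYS` and `word_addresses` are
hypothetical, and the formula is inferred from the numbers in
the text rather than stated by it.

```python
from itertools import accumulate

# Delay values D1..D8 in cycles, as given in the walkthrough.
DELAYS = [128, 64, 16, 32, 8, 1, 4, 2]


def word_addresses(sample: int) -> tuple[int, int]:
    """The two consecutive 16-bit word addresses holding a 24-bit sample."""
    first = (24 * sample) // 16
    return first, first + 1


taps = list(accumulate(DELAYS))  # cumulative delay of A1..A8
for k in [0] + taps:
    print(k, word_addresses(k))
```

Even sample indices start at bit 0 of their first word and odd
indices at bit 8, which is why samples 248 and 249 share address
373.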

In the above operation, data changes in three patterns as
shown in Fig. 13. Specifically, Fig. 13 is a diagram showing
change of data to be stored in two memory blocks. In the
drawing, a blank region indicates a part in which data is to be
updated, and a diagonally shaded region indicates a part in
which no data is to be updated.

In each pattern of change in Fig. 13, components in the DSP
operate as follows.

In a pattern 1 of Fig. 13, first, the control section reads
data stored at address n from the first memory block 19. The
control section also reads data stored at address n+1 from the
second memory block 21. The data thus read are output to the
shifter 11 and the register 27. The data stored at address n is
a part of a combination of the I0 part data and R0 part data.
The data stored at address n+1 is a part of a combination of the
R0 part data, the unused data, and I1 part data.

The shifter 11 performs phase adjustment by shifting, by a
predetermined number of bits, data coming from the accumulator
15, the memory 17, and the like.

Fig. 14 is a diagram showing the internal structure of the
ALU 13' of the present embodiment. In Fig. 14, A and B denote
data coming from the first and second memory blocks 19 and 21
via the shifter 11, C denotes a carry signal, K a division
signal, X an output signal, and FA an add operation circuit.

The ALU 13' divides the data coming from the shifter 11 into
data including I0 part data and R0 part data (hereinafter
referred to as arithmetic data) and other data (non-arithmetic
data). This division is performed based on a division signal K
coming from the control section, and only when the division
signal K indicates 0. Here, the arithmetic data corresponds to
an output signal X shown in Fig. 14.
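
One way to picture a carry that is divided by a division signal
is a ripple-carry adder in which the carry from one bit position
to the next is gated. The Python sketch below is only a
behavioural model under stated assumptions: it treats K as a
per-bit mask in which a 0 at bit i blocks the carry out of bit
i, and the name `split_add` is hypothetical. The actual circuit
of Fig. 14 may encode K differently.

```python
def split_add(a: int, b: int, k_mask: int, width: int = 32) -> int:
    """Ripple-carry add; the carry out of bit i reaches bit i+1 only
    when bit i of k_mask is 1 (a 0 divides the carry chain there)."""
    result = 0
    carry = 0
    for i in range(width):
        abit = (a >> i) & 1
        bbit = (b >> i) & 1
        result |= (abit ^ bbit ^ carry) << i
        carry_out = (abit & bbit) | (carry & (abit ^ bbit))
        carry = carry_out & ((k_mask >> i) & 1)
    return result


# Assumed layout: I part in bits 0-9, R part in bits 10-19.
K = 0xFFFFFFFF & ~(1 << 9) & ~(1 << 19)  # cut after each 10-bit field
a = 0x3FF | (0x001 << 10)                # I = 0x3FF, R = 1
b = 0x001 | (0x001 << 10)                # I = 1,     R = 1
print(hex(split_add(a, b, K)))           # I wraps to 0, R becomes 2
print(hex(split_add(a, b, 0xFFFFFFFF)))  # undivided: I overflows into R
```

With the carry divided, the overflow of the I field does not
disturb the R field, so both fields can be processed by one wide
adder in a single pass.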

The ALU 13' receives, from the accumulator 15, an arithmetic
result under the same cycle as that of the current arithmetic
process. The arithmetic result thus received is referred to as
the last cycle arithmetic result, and is data including both I
part data and R part data. The ALU 13' uses the arithmetic data
received from the shifter 11 and the last cycle arithmetic
result from the accumulator 15 to execute an arithmetic
operation under the above-described DMF algorithm. In Fig. 13,
the arithmetic result derived thereby is indicated as
combination data of I0' part data and R0' part data.

The arithmetic result is output from the ALU 13' to the
accumulator 15 for storage, and for output to the combining
circuit 29 along a first route 23. The arithmetic result is also
output to the ALU 13' along a second route 25 at a predetermined
timing under the control of the control section.

On the other hand, under the control of the control section,
the register 27 outputs the data coming from the memory 17 at a
predetermined timing to the combining circuit 29 along a third
route 23'.

As to the data coming from the register 27 along the third
route 23', the combining circuit 29 performs data division at
8-bit intervals under the control of the control section. Out of
the data thus divided, the arithmetic data (i.e., the low-order
8 bits and high-order 8 bits stored in the first memory block
19, and the low-order 8 bits stored in the second memory block
21) is replaced with the data coming from the ALU 13' along the
first route 23. To the resulting data, the non-arithmetic data
(i.e., the high-order 8 bits stored in the second memory block
21) is added to generate output data. The output data thus
generated is forwarded to the memory 17.

The memory 17 stores the output data coming from the combining
circuit 29 at original addresses. More specifically, the
low-order 16 bits of the output data are stored at address n of
the first memory block 19, and the high-order 16 bits thereof at
address n+1 of the second memory block 21.

After the data storage, with the pattern 1, as to the data
stored in the first and second memory blocks 19 and 21, only the
high-order 8 bits stored in the second memory block 21 remain
the same, but the rest are updated to new values.

In a pattern 2 shown in Fig. 13, the control section reads
data stored at address n+1 from the second memory block 21. The
control section also reads the data stored at address n+2 from
the first memory block 19. The data thus read are output to the
shifter 11 and the register 27. Here, the data stored at address
n+1 is a part of the R0' part data, the unused data, and a part
of the I1 part data. The data stored at address n+2 is a part of
the I1 part data, a part of R1 part data, and the unused data.

With respect to the data coming from the accumulator 15, the
memory 17, and the like, the shifter 11 performs phase
adjustment by shifting a predetermined number of bits. The
result is then output to the ALU 13'.

The ALU 13' divides the data coming from the shifter 11 into
arithmetic data and non-arithmetic data based on a division
signal K.

The ALU 13' then receives, from the accumulator 15, the last
cycle arithmetic result. Using the arithmetic data received from
the shifter 11 and the last cycle arithmetic result from the
accumulator 15, the ALU 13' executes an arithmetic operation
under the DMF algorithm. Fig. 13 shows the arithmetic result
derived thereby as combination data of I1' part data and R1'
part data.

The arithmetic result is output from the ALU 13' to the
accumulator 15 for storage, and for output to the combining
circuit 29 along the first route 23. The arithmetic result is
also output to the ALU 13' along the second route 25 at a
predetermined timing under the control of the control section.

On the other hand, under the control of the control section,
the register 27 outputs the data coming from the memory 17 at a
predetermined timing along the third route 23' to the combining
circuit 29.

As to the data coming from the register 27 along the third
route 23', the combining circuit 29 performs data division at
8-bit intervals under the control of the control section. Out of
the data thus divided, the arithmetic data (i.e., the high-order
8 bits stored in the second memory block 21, and the high-order
8 bits and low-order 8 bits stored in the first memory block 19)
is replaced with the data coming from the ALU 13' along the
first route 23. To the resulting data, the non-arithmetic data
(i.e., the low-order 8 bits stored in the second memory block
21) is added to generate output data. The output data thus
generated is forwarded to the memory 17.

The memory 17 stores the output data coming from the combining
circuit 29 at original addresses. More specifically, the
low-order 16 bits of the output data are stored at address n+1
of the second memory block 21, and the high-order 16 bits
thereof at address n+2 of the first memory block 19.

After the data storage, with the pattern 2, as to the data
stored in the first and second memory blocks 19 and 21, only the
low-order 8 bits stored in the second memory block 21 remain the
same, but the rest are updated to new values.

In a pattern 3 shown in Fig. 13, the control section reads
data stored at address n+3 from the second memory block 21. The
control section also reads the data stored at address n+4 from
the first memory block 19. The data thus read are output to the
shifter 11 and the register 27. Here, the data stored at address
n+3 is I2 part data and a part of R2 part data. The data stored
at address n+4 is a part of the R2 part data, the unused data,
and I3 part data.

With respect to the data coming from the accumulator 15, the
memory 17, and the like, the shifter 11 performs phase
adjustment by shifting a predetermined number of bits. The
result is then output to the ALU 13'.

The ALU 13' divides the data coming from the shifter 11 
into arithmetic data and non-arithmetic data based on a division 
signal K. 

The ALU 13' then receives, from the accumulator 15, the last
cycle arithmetic result. Using the arithmetic data received from
the shifter 11 and the last cycle arithmetic result from the
accumulator 15, the ALU 13' executes an arithmetic operation
under the DMF algorithm. Fig. 13 shows the arithmetic result
derived thereby as combination data of I2' part data and R2'
part data.

The arithmetic result is output from the ALU 13' to the
accumulator 15 for storage, and for output to the combining
circuit 29 along the first route 23. The arithmetic result is
also output to the ALU 13' along the second route 25 at a
predetermined timing under the control of the control section.

On the other hand, under the control of the control section,
the register 27 outputs the data coming from the memory 17 at a
predetermined timing along the third route 23' to the combining
circuit 29.

As to the data coming from the register 27 along the third
route 23', the combining circuit 29 performs data division at
8-bit intervals under the control of the control section. Out of
the data thus divided, the arithmetic data (i.e., the low-order
8 bits and high-order 8 bits stored in the second memory block
21, and the low-order 8 bits stored in the first memory block
19) is replaced with the data coming from the ALU 13' along the
first route 23. To the resulting data, the non-arithmetic data
(i.e., the high-order 8 bits stored in the first memory block
19) is added to generate output data. The output data thus
generated is forwarded to the memory 17.

The memory 17 stores the output data coming from the combining
circuit 29 at original addresses. More specifically, the
low-order 16 bits of the output data are stored at address n+3
of the second memory block 21, and the high-order 16 bits
thereof at address n+4 of the first memory block 19.

After the data storage, with the pattern 3, as to the data 
stored in the first and second memory blocks 19 and 21, only 
the high-order 8 bits stored in the first memory block 19 remain 
the same, but the rest are updated to new values. 
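
The three patterns reduce to one read-modify-write rule: three
of the four bytes in a pair of 16-bit words are replaced by the
new 24-bit result, and the one byte that belongs to a
neighbouring sample is kept. The Python sketch below models the
memory as a flat list of 16-bit words; the function name and the
flat list are assumptions for illustration, and the split into
two physical memory blocks is not modelled.

```python
def update_sample(mem: list[int], sample: int, new24: int) -> None:
    """Overwrite the 24-bit unit of `sample`, preserving the neighbour's byte."""
    a = (24 * sample) // 16  # first of the two word addresses
    if sample % 2 == 0:
        # Patterns 1 and 3: unit fills word a and the low byte of word a+1.
        kept = mem[a + 1] & 0xFF00  # high byte belongs to the next sample
        mem[a] = new24 & 0xFFFF
        mem[a + 1] = kept | ((new24 >> 16) & 0xFF)
    else:
        # Pattern 2: unit fills the high byte of word a and all of word a+1.
        kept = mem[a] & 0x00FF      # low byte belongs to the previous sample
        mem[a] = kept | ((new24 & 0xFF) << 8)
        mem[a + 1] = (new24 >> 8) & 0xFFFF


mem = [0x07FF, 0x5500, 0x0AA9, 0x0000, 0x0000, 0x0000]
update_sample(mem, 0, 0x123456)  # pattern 1: high byte of mem[1] survives
update_sample(mem, 1, 0xABCDEF)  # pattern 2: low byte of mem[1] survives
print([hex(w) for w in mem])
```

After both updates the shared word holds one byte from each
sample, so neither write has disturbed the other.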

Fig. 15 is a diagram showing change of output data in the
combining circuit 29. In Fig. 15, Acc[23:0] and Reg[7:0] on the
upper left, and Reg[15:8] and Acc[23:0] on the lower left are
data generated through combination, by the combining circuit 29,
of outputs from the Acc 15 and the register 27. Further,
Out[31:0] on the right side is the output data to be output to
the memory 17 after the combining circuit 29 selects either
Acc[23:0] and Reg[7:0] on the upper left, or Reg[15:8] and
Acc[23:0] on the lower left. This selection is made based on the
original addresses of the data read from the memory 17.

Here, Acc[x:y] represents output data from the accumulator
15, between xth bit and yth bit. Reg[x:y] represents output
data from the register 27, between xth bit and yth bit. For
example, Acc[23:0] and Reg[7:0] represent a combination of 24-bit
output data from the accumulator 15 between 0th bit and 23rd
bit, and 8-bit output data from the register 27 between 0th bit
and 7th bit. Similarly, Out[x:y] represents output data from
the combining circuit 29 between xth bit and yth bit. For example,
Out[31:0] represents 32-bit output data from the combining
circuit 29 between 0th bit and 31st bit.
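
The two combinations named for Fig. 15 can be pictured with a
minimal sketch. It assumes that the order in which the fields are
listed is most-significant first, so that "Acc[23:0] and Reg[7:0]"
places the register byte in the 8 low-order bits and "Reg[15:8]
and Acc[23:0]" places it in the 8 high-order bits. The function
name combine and its parameters are hypothetical and are not taken
from the specification.

```python
MASK8 = 0xFF
MASK24 = 0xFFFFFF


def combine(acc: int, reg: int, reg_in_low_byte: bool) -> int:
    """Build Out[31:0] from Acc[23:0] and one byte of the register.

    Assumption: fields are concatenated most-significant first.
    """
    acc24 = acc & MASK24
    if reg_in_low_byte:
        # Out = {Acc[23:0], Reg[7:0]}: register byte is the 8 LSBs.
        return (acc24 << 8) | (reg & MASK8)
    # Out = {Reg[15:8], Acc[23:0]}: register byte is the 8 MSBs.
    return (((reg >> 8) & MASK8) << 24) | acc24


if __name__ == "__main__":
    print(hex(combine(0xABCDEF, 0x1234, True)))
    print(hex(combine(0xABCDEF, 0x1234, False)))
```

In this sketch the flag reg_in_low_byte stands in for the
selection that the combining circuit 29 makes from the original
addresses of the data.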

For every arithmetic process, the control section provides
the I255 part data and R255 part data at address 383 with the input
A0 of the next delay value D1. In the next arithmetic process,
the control section regards address 383 as the new address 0 for
process execution. This is equivalent to executing the process
with a reference point moved to the left by one under the assumption
that addresses 0 and 383 of the memory 17 are connected as a
ring. This allows the DSP to output data for arithmetic
operation in a preferable manner with simpler control.

Such control is described in more detail referring to
Fig. 16. Fig. 16 is a diagram showing arrangement of cyclic
data. In the drawing, a diagonally shaded region indicates a
section to which data is input.

The control section cyclically uses data stored at a given
address of the memory 17 determined by the DMF algorithm as output
data of the delay values D1, D2, D3, D4, D5, D6, D7, and D8. For
every arithmetic process, these data are updated to a result
derived by arithmetic operation using the patterns 1 to 3 as
output data of each of the delay values D1 to D8. Here, the output
data of the last delay value D8 will be head input data D0 of
the delay value D0 for the next arithmetic process. In such
a manner, the arithmetic process in a cycle of the DMF algorithm
is executed. Then, the control section moves the reference
position of Fig. 16 to the left by a predetermined amount (one
in this example) before executing the next arithmetic process.
The arithmetic process at this time is similar in operation to
the above. In this manner, the control section can read input
and output corresponding to the delay values successively and
easily from the memory 17. Such a function can be easily realized
by utilizing modulo addressing, which is a standard provision
in the DSP.
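
The ring arrangement of Fig. 16 can be pictured with a minimal
sketch of modulo addressing. It assumes a buffer of 384 entries
(addresses 0 to 383, as in the text) and a shift of one position
per arithmetic process. The names RING_SIZE, physical_address and
advance_base are hypothetical, and a real DSP would perform this
in its address-generation hardware rather than in software.

```python
RING_SIZE = 384  # addresses 0..383 treated as connected in a ring


def physical_address(logical: int, base: int) -> int:
    """Map a logical address to a physical one for a given reference point."""
    return (base + logical) % RING_SIZE


def advance_base(base: int) -> int:
    """Move the reference point left by one.

    After the move, the entry that was at the last logical address
    (383) becomes logical address 0.
    """
    return (base - 1) % RING_SIZE


if __name__ == "__main__":
    base = 0
    base = advance_base(base)
    print(base, physical_address(0, base), physical_address(1, base))
```

Only the reference point changes between arithmetic processes;
no stored data has to be moved.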

Described next is the operation of components in the DSP.
Herein, the ALU 13' is able to cut a carry signal C at an arbitrary
bit position based on a division signal K or a register value.
In this embodiment, the ALU 13' cuts the carry signal at 10-bit
intervals.
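
The effect of cutting the carry at 10-bit intervals can be
pictured with a minimal sketch of lane-wise addition. It models
only the arithmetic effect, not the circuit of the ALU 13'. The
function name add_with_carry_cut, the parameter names, and the
default total width of 20 bits (two 10-bit fields) are assumptions
chosen purely for illustration.

```python
def add_with_carry_cut(a: int, b: int, lane_bits: int = 10,
                       width: int = 20) -> int:
    """Add a and b lane by lane, discarding the carry out of each lane.

    Assumption: operands are packed as consecutive lane_bits-wide
    fields inside a width-bit word.
    """
    mask = (1 << lane_bits) - 1
    result = 0
    for shift in range(0, width, lane_bits):
        lane_sum = ((a >> shift) & mask) + ((b >> shift) & mask)
        # Keep only the lane's own bits; the carry is cut here.
        result |= (lane_sum & mask) << shift
    return result


if __name__ == "__main__":
    a = (1 << 10) | 0x3FF   # upper field 1, lower field 1023
    b = (2 << 10) | 0x001   # upper field 2, lower field 1
    print(add_with_carry_cut(a, b) >> 10, add_with_carry_cut(a, b) & 0x3FF)
```

With an ordinary adder the overflow of the lower field would
increment the upper field; with the carry cut, each field is
computed independently.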

If the data stored in the memory 17 has the structure of
pattern 1 of Fig. 13, the components in the DSP operate as follows.

First, the control section reads 32-bit data in total from
predetermined addresses of the first and second memory blocks
19 and 21. The data thus read is output to the shifter 11 and the
register 27. Without shifting, the shifter 11 outputs the data
as it is to the ALU 13'. The register 27 temporarily stores
the data read out from predetermined addresses of the first and
second memory blocks 19 and 21.

Then, the ALU 13' executes the arithmetic process, and
outputs the result to Acc0 and Acc1.

After temporarily storing the arithmetic result, the Acc0
and Acc1 output the result to the shifter 11. The shifter 11
outputs the result as it is to the Acc0 and Acc1 via the ALU
13'. The Acc0 and Acc1 temporarily store the result again.

Thereafter, the register 27 outputs the data in its storage
to the combining circuit 29, and at the same time, the Acc0 and
Acc1 output the arithmetic result derived by the ALU 13' to the
combining circuit 29.

Next, the combining circuit 29 combines the data stored
in the register 27 and the arithmetic result derived by the ALU
13' and stored in the Acc0 and Acc1, and outputs the combination
result to the memory 17. Here, the combination is done so that
the 8 least significant bits (LSB) are the data stored in
the register 27. The memory 17 stores the combination result
at the original addresses of the first and second memory blocks
19 and 21.

If the data stored in the memory 17 has the structure of
pattern 2 of Fig. 13, the components in the DSP operate as follows.

First, the control section reads 32-bit data in total from
predetermined addresses of the first and second memory blocks
19 and 21. The data thus read is output to the shifter 11 and the
register 27. The shifter 11 outputs the data to the ALU 13'
after shifting the data by 8 bits to the right. The register 27
temporarily stores the data read out from predetermined addresses
of the first and second memory blocks 19 and 21.

Then, the ALU 13' executes the arithmetic process, and
outputs the arithmetic result to the Acc0 and Acc1. After
temporarily storing the result, the Acc0 and Acc1 output the
result to the shifter 11. The shifter 11 outputs the result,
after shifting it by 8 bits to the left, to Acc0 and Acc1 via
the ALU 13'. The Acc0 and Acc1 temporarily store the result
thus derived again.

Thereafter, the register 27 outputs the data in its storage
to the combining circuit 29, and at the same time, the Acc0 and
Acc1 output the result derived by the ALU 13' and shifted by
8 bits to the left to the combining circuit 29.

Next, the combining circuit 29 combines the data stored
in the register 27 and the result derived by the ALU 13' and
stored in the Acc0 and Acc1, and outputs the combination result
to the memory 17. Here, the combination is done so that the
8 most significant bits (MSB) are the data stored in the register
27. The memory 17 stores the combination result at the
original addresses of the first and second memory blocks 19 and
21.

If the data stored in the memory 17 has the structure of
pattern 3 of Fig. 13, the components in the DSP operate as follows.

First, the control section reads 32-bit data in total from
predetermined addresses of the first and second memory blocks
19 and 21. The data thus read is output to the shifter 11 and the
register 27. Without shifting, the shifter 11 outputs the data
as it is to the ALU 13'. The register 27 temporarily stores
the data read out from predetermined addresses of the first and
second memory blocks 19 and 21.

Then, the ALU 13' executes the arithmetic process, and
outputs the arithmetic result to the Acc0 and Acc1. After
temporarily storing the result, the Acc0 and Acc1 output the
result to the shifter 11. The shifter 11 outputs the result
as it is to the Acc0 and Acc1 via the ALU 13'. The Acc0 and Acc1
temporarily store the result again.

Thereafter, the register 27 outputs the data in its storage
to the combining circuit 29, and at the same time, the Acc0 and
Acc1 output the result derived by the ALU 13' to the combining
circuit 29.

Next, the combining circuit 29 combines the data stored
in the register 27 and the result derived by the ALU 13' and
stored in the Acc0 and Acc1, and outputs the combination result
to the memory 17. Here, the combination is done so that the
8 least significant bits (LSB) are the data stored in the register
27. The memory 17 stores the combination result at the
original addresses of the first and second memory blocks 19 and
21.

Described below is an arithmetic process in the ALU 13'.
Figs. 17 and 18 are diagrams roughly showing the DMF algorithm.
In Figs. 17 and 18, the parenthesized parts are taken as examples
to describe an arithmetic operation in detail.

First, the control section reads 32-bit data from addresses
0 and 1 of the first and second memory blocks 19 and 21. The
read data is output to the shifter 11 and the register 27. Without
shifting, the shifter 11 outputs the data as it is to the ALU
13'. The register 27 temporarily stores the data read from
addresses 0 and 1 of the first and second memory blocks 19 and
21. Then, the ALU 13' executes an arithmetic process.

In the process, the control section generates a division
signal K based on the data structure. Based on the division
signal K thus generated, the ALU 13' divides the data read from
addresses 0 and 1 of the first and second memory blocks 19 and 21
to derive an input A0. As shown in Fig. 17, the ALU 13' then
calculates A1, B1, C1 and C1'. Out of the arithmetic result thus
derived, those found in the upper part of Fig. 17 roughly showing
the DMF algorithm (e.g., C1) are stored in the Acc1 of the
accumulator 15, and those in the lower part (e.g., B1 and C1') are
stored in the Acc0 of the accumulator 15. In such a manner, the
arithmetic result is stored in the Acc0 and Acc1 of the accumulator
15.

The arithmetic result C1 is overwritten at the tail of
the data corresponding to the delay value D1 stored in the Acc0
and Acc1 of the accumulator 15. Because the data D1 and D2 are
successive, the arithmetic result C1 will be input data of the
next delay value D2.

Note here that, among the data read from addresses 0 and
1 of the first and second memory blocks 19 and 21, non-arithmetic
data is stored in the register 27. This part of the data is
combined with the arithmetic result by the combining circuit
29. The combining circuit 29 outputs the combination result
to the memory 17, and has the first and second memory blocks
19 and 21 store the result at addresses 0 and 1. Accordingly,
in the present embodiment, there is no need to take time for
arithmetic operation of non-arithmetic data, and the output data
is thereby easily generated. Further, the unused data can be
written out as it is at the time of overwriting.

Referring to Fig. 18, the ALU 13' calculates A2, B2, C2,
and C2'. Out of the arithmetic result thus derived, those found
in the upper part of Fig. 18 roughly showing the DMF algorithm
(e.g., C2) are stored in the Acc1 of the accumulator 15, and
those in the lower part (e.g., B2 and C2') are stored in the
Acc0 of the accumulator 15. In such a manner, the arithmetic
result is stored in the Acc0 and Acc1 of the accumulator 15.

The arithmetic result C2 is overwritten at the tail of
the data corresponding to the delay value D2 in the Acc0 and
Acc1 of the accumulator 15. Because the data D2 and D3 are
successive, the arithmetic result C2 will be input data of the
next delay value D3.

In such a manner, the ALU 13' executes similar operations
successively, finally deriving the arithmetic result.

Thereafter, the control section moves the reference point
of Fig. 16 to the left by one so that the data stored in the
memory 17 advances cyclically. Then, the next arithmetic
operation is executed.

As described in detail in the foregoing, by simply
including the predetermined-bit register 27 and the combining
circuit 29, the present invention reduces the memory usage
amount during the DMF process. Thus, if utilized in the DSP used
for the third generation mobile phone, the present invention can
reduce the memory amount from 510 words to 384 words, that is,
achieve a reduction of 126 words.
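
Restated as a worked calculation from the word counts given
above (the percentage is derived here and is not stated in the
specification):

```latex
510 - 384 = 126 \ \text{words}, \qquad
\frac{126}{510} \approx 0.247 \ (\text{about } 24.7\%)
```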

Moreover, the present invention does not require circuit 
increase, achieving such effects with less cost. 

While the invention has been described in detail, the
foregoing description is in all aspects illustrative and not
restrictive. It is understood that numerous other
modifications and variations can be devised without departing
from the scope of the invention. For example, in the embodiment,
the memory 17 may be structured by 32-bit memory. Further, the
present invention is applicable not only to the DSP but also
to any devices executing processing with respect to 9-bit to
12-bit data.

As described above, by simply including the
predetermined-bit register 27 and the combining circuit 29, the
present invention can reduce the memory usage amount during
the DMF process.

In the present invention, a data reading method may be
claimed. In an arithmetic circuit including an arithmetic logic
unit for executing a predetermined arithmetic operation and
memory for data storage, the method comprises the following steps.
Data are read by 2^n bits from the memory including first and
second memory blocks. The read data is divided into an arithmetic
part to be used for an arithmetic process and a non-arithmetic
part not to be used therefor. In the reading step, data reading
is done, in a predetermined order, from the first and second
memory blocks in the same stage, and from the second memory block
and the first memory block in a subsequent stage. In the dividing
step, the data read from the memory is divided into the arithmetic
part and the non-arithmetic part by shifting the non-arithmetic
part by a predetermined number of bits every time an arithmetic
operation is executed.
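
The reading and dividing steps can be pictured with a minimal
sketch. It assumes 32-bit reads, a 24-bit arithmetic part, and an
8-bit non-arithmetic part whose position is selected by a shift of
0 or 8 bits. The function names blocks_for_access and
split_read_data, the block labels "first" and "second", and the
flag straddles_stage are hypothetical and are not taken from the
specification.

```python
ARITH_MASK = 0xFFFFFF   # assumed 24-bit arithmetic part
WORD_MASK = 0xFFFFFFFF  # assumed 32-bit read


def blocks_for_access(stage: int, straddles_stage: bool):
    """Return the (block, stage) pairs for one read, in reading order."""
    if not straddles_stage:
        # First and second memory blocks in the same stage.
        return [("first", stage), ("second", stage)]
    # Second memory block, then first memory block of the next stage.
    return [("second", stage), ("first", stage + 1)]


def split_read_data(word: int, shift: int):
    """Divide a 32-bit read into (arithmetic, non_arithmetic) parts.

    shift is the number of low-order bits that lie below the
    arithmetic part (0 or 8 in this sketch).
    """
    arithmetic = (word >> shift) & ARITH_MASK
    # Non-arithmetic bits are left in their original positions.
    non_arithmetic = word & ~(ARITH_MASK << shift) & WORD_MASK
    return arithmetic, non_arithmetic


if __name__ == "__main__":
    print(blocks_for_access(3, True))
    print([hex(v) for v in split_read_data(0x12ABCDEF, 8)])
```

Leaving the non-arithmetic bits in place means they can later be
merged back with the arithmetic result by a simple bitwise OR.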